REMARKS/ARGUMENTS 

Claims 1-13 are pending in the present application. Claims 1, 5 and 13 are rejected and 
claims 2-4 and 6-12 are objected to in the present Office Action. Claims 1, 5 and 13 are thus
amended herein to more clearly define the present invention. These amendments are supported 
by the application as originally filed and thus there is no issue of new matter raised by the 
amendments. Entry of this Amendment into the file of the present application is respectfully 
requested as it is believed to place the entire application in condition for allowance or, at a 
minimum, to materially reduce the issues for an appeal. 

Rejections Under 35 U.S.C. 112, Second Paragraph 

Claims 1 and 13 are rejected under 35 U.S.C. 112, second paragraph, for the reasons set
forth on p. 2 of the Office Action. These rejections are respectfully traversed. 

In response to the above rejection, the subject claims have been amended to delete the 
language which the Examiner indicated as being 'confusing' and to more clearly describe what 
applicants deem to be their invention. These amendments are believed to overcome the 
Examiner's grounds for rejection under §112, second paragraph and the Examiner is, thus, 
respectfully requested to reconsider and withdraw the rejection of claims 1 and 13. 

Rejections Under 35 U.S.C. 112, First Paragraph 

Claims 5 and 13 are rejected under 35 U.S.C. 112, first paragraph, as set forth at pp. 2-4
of the Office Action. The Examiner alleges that the subject claims do not comply with the
'enablement requirement' of 35 U.S.C. 112, first paragraph, in that they contain subject matter
that was not described in the specification in such a way as to enable one skilled in the art to
which it pertains, or with which it is most nearly connected, to make and/or use the invention.
This ground of rejection is respectfully traversed.

In response to the rejection, applicants have amended the language of claim 5 to recite
that the endonuclease according to the invention cleaves the DNA "between the fourth and fifth
bases 3' of said particular nucleotide sequence and, in the complementary strand, between the
fifth and sixth bases 5' of the complement of said particular nucleotide sequence". Additionally,
claim 13 is amended to now recite that the enzyme specifically recognizes DNA at a particular
nucleotide sequence and that it cleaves the DNA "between the fourth and fifth bases 3' of said
particular nucleotide sequence and, in the complementary strand, between the fifth and sixth
bases 5' of the complement of said particular nucleotide sequence". As amended, the subject
claims are believed to fully meet the 'enablement requirement' of section 112, first paragraph,
and the Examiner is, therefore, respectfully requested to reconsider and withdraw the rejection.

Notwithstanding the above, applicants note that the Examiner indicates in the Office 
Action that he found the arguments made in the previous response dated August 25, 2006 against 
the §112, first paragraph, rejection of claims 5 and 13 to be confusing. The following discussion 
is, therefore, intended to clarify the record with regard to the arguments made by applicants at
pp. 7-10 in their August 25, 2006 Amendment.

Prior to clarifying their previous arguments, however, applicants note the Examiner's
statement on p. 3 (lines 14-17) of the present Office Action that "[t]he second, third and
fourth positions in the figure on page 9 of the amendment shows cleavage as being through a
base (line through the letter), not to the left or right of the base." Applicants apologize for any
confusion caused by this typographical error in the preparation of the indicated figure and submit
that a corrected replacement for the subject figure, wherein the cleavage lines pass between the
relevant bases, is included at the end of this response. The Examiner is respectfully requested to
enter the new figure as a replacement for the original figure at page 9 of applicants' prior
response.

As indicated at p. 9 of the August 25, 2006 Amendment, in the present invention lambda
phage DNA was digested with HpyC1I to determine the recognition and cleavage site of
HpyC1I. Because the digested restriction fragments may have sticky ends, the fragments were
blunted with T4 DNA polymerase to facilitate the subsequent ligation step (see the
figure on p. 10 of the August 25th response) and then cloned into the EcoRV site of pBR322 plasmid.
The restriction fragment-vector junction was then sequenced. Because the sequences of both the pBR322
plasmid and lambda phage DNA are already known, the inserted fragment and the
restriction fragment-vector junction can each be identified. By comparing the 10 junction sequences with
the original lambda phage DNA (as shown in the replacement for the figure originally provided
on p. 9 of applicants' August 25th response, which replacement figure is appended at the end of
this response), a recognition site (5'-CCATC-3') was identified at a constant distance from the
junction. When the recognition site (5'-CCATC-3') is located in the cloned HpyC1I restriction
fragment, there are 5 base pairs between the recognition site and the junction. On the other
hand, if the recognition site (5'-CCATC-3') is not located in the cloned HpyC1I restriction
fragment, there are only 4 base pairs between the recognition site and the junction. The
foregoing comparison thus permitted the inventors to deduce that HpyC1I recognizes the sequence
5'-CCATC-3' and cleaves DNA between the fourth and fifth bases 3' of the particular nucleotide
sequence and, in the complementary strand, between the fifth and sixth bases 5' of the
complement of the particular nucleotide sequence.
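
By way of illustration only, the cleavage rule deduced above can be expressed as a short Python sketch. The function name find_cuts, its offset parameters and the 17-base test sequence are hypothetical and are introduced here solely for illustration; they do not appear in the specification. The sketch considers only recognition sites read 5' to 3' on the strand supplied.

```python
def find_cuts(seq, site="CCATC", top_offset=4, bottom_offset=5):
    """Locate each recognition site on the given strand and its two cut points.

    A cut index i means the DNA is cut between seq[i-1] and seq[i]
    (0-based, top-strand coordinates). Offsets of 4 and 5 correspond to
    cleavage between the 4th/5th bases (top strand) and the 5th/6th bases
    (complementary strand) beyond the recognition site.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)  # index of the first base after the site
        hits.append((start, end + top_offset, end + bottom_offset))
        start = seq.find(site, start + 1)
    return hits


# Hypothetical test sequence (not taken from lambda DNA).
example = "AAACCATCGGGGTTTTT"
for site_start, top_cut, bottom_cut in find_cuts(example):
    # Show the top strand with a bar at the top-strand cut position.
    print(site_start, example[:top_cut] + "|" + example[top_cut:], bottom_cut)
```

In this toy sequence the site begins at index 3, so the top-strand cut falls after the four bases GGGG and the complementary-strand cut falls one base further along.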

Paragraph 14 of the present specification states that "a non-palindromic recognition
sequence of 5'-CCATC-3' (designated SEQ ID NO: 1) and cleaves the fourth base
downstream from the recognition sequence of the upper strand and the fifth base from that of
the lower strand of SEQ ID NO: 1." Based on the above, the inventors determined that
"downstream" as used in the specification refers to the recognition site (5'-CCATC-3'), not to its inverse
complement. Applicants further stated in their prior response that the arrow in the illustration on
p. 9 thereof (which is to be replaced by the corrected figure provided herewith) indicates the
junction site, not the cleavage site. The Office Action indicates, however, that there is some
confusion on the part of the Examiner regarding applicants' discussion of the "junction site". A
further discussion concerning such "junction site" is therefore provided herein.

The Examiner is respectfully informed that the so-called "junction site" is not a new site
created along the 'string' of nucleotides. It refers, in fact, to a reference site well known among
those having ordinary skill in this art for analyzing the cleavage site of restriction enzymes. The
use of such "junction sites" is demonstrated, for example, in Zylicz-Stachula, A., et al.,
"TspGWI, a thermophilic class-IIS restriction endonuclease from Thermus sp., recognizes novel
asymmetric sequence 5'-ACGGA(N11/9)-3'", Nucleic Acids Research (2002) Vol. 30, No. 7, e33
and Skowron, P. M., et al., "A new Thermus sp. Class-IIS enzyme sub-family: isolation of a
'twin' endonuclease TspDTI with a novel specificity 5'-ATGAA(N11/9)-3', related to TspGWI,
TaqII and Tth111II", Nucleic Acids Research (2003) Vol. 31, No. 14, e74, copies of which are
provided with this response. As shown by these references, for purposes of analysis, lambda DNA
was digested with a restriction enzyme and the restriction fragments obtained were then
subcloned into backbone vectors in order to proceed with a sequence analysis. Consequently, four
types of cloning construct, in which the sticky ends were blunted by polymerase, were generated,
as shown in the figures provided at pp. 9-10 of applicants' previous response (with the figure
from p. 9 now being substituted by the replacement figure provided herewith) and described at p.
8 of applicants' Amendment dated August 25, 2006.

As indicated, before being cloned into the pBR322 plasmid, the restriction fragments are 
blunted by T4 DNA polymerase, which conclusively demonstrates that the junction site is not the 
cleavage site. Instead, the junction site may be off from the cleavage site by a base more or less. 
Thus, the base 'beside' the cleavage site serves as a junction for ligation in a subsequent cloning 
step. It is, in fact, due to this function that the site is referred to as a "junction site" and it is not 
meant to be confused with the cleavage site. 
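
The relationship between the cleavage site and the junction site can also be illustrated with a brief Python sketch, offered as an illustration only. It assumes the well-known behavior of T4 DNA polymerase in the presence of dNTPs, namely that recessed 3' ends are filled in and protruding 3' ends are removed, so that each blunt end coincides with the 5' terminus left by the restriction enzyme. The function name blunt_junctions and its parameters are hypothetical.

```python
def blunt_junctions(top_offset, bottom_offset):
    """Return (d_with_site, d_without_site): the number of base pairs between
    the recognition site and the blunt junction for the fragment that carries
    the site and for the neighbouring fragment that does not.

    top_offset:    bases 3' of the site after which the top strand is cut
    bottom_offset: the same count for the complementary strand
    """
    # Fragment carrying the site: its 5' terminus at the cut end belongs to
    # the complementary strand, which reaches bottom_offset bases past the site.
    left_five_prime_end = bottom_offset
    # Neighbouring fragment: its 5' terminus belongs to the top strand, which
    # begins immediately after top_offset bases past the site.
    right_five_prime_end = top_offset
    # Assumption: T4 DNA polymerase fills in or trims each end to the
    # 5' terminus, so the junction sits at that position.
    return left_five_prime_end, right_five_prime_end


# Cut positions of 4 and 5, as recited in the amended claims.
print(blunt_junctions(4, 5))
```

With cut positions of 4 and 5, the sketch returns 5 base pairs when the recognition site lies within the cloned fragment and 4 base pairs when it does not, which agrees with the distances discussed above.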

Applicants thus trust that the explanation provided above will serve to clarify that the
vertical arrows shown in Table 1 on p. 11 refer to the junction site, instead of to the cleavage site.
To avoid confusion between the junction site and the cleavage site, applicants have reconfigured 
the information found in Table 1 into a revised chart wherein a portion of the sequences listed at 
positions 1635-1596, 5009-4970, 9894-9855, 12443-12404 and 39627-39588 in Table 1 has been 
amended by replacement with the sequence of the strand reading from 5' to 3' such that, 
consequently, the junction site matches up with the cleavage site. This revision to Table 1, 
prepared by applicants to clarify the relation between the 'junction site' and the 'cleavage site', is 
provided below. 



Position in lambda DNA    *DNA sequences flanking HpyC1I cleavage sites in lambda DNA

1325-1364                 5'-CTGGCCAAAG7CCATCCGTGlGCTCC ACGCCAAAAGTGAGA-3'
1635-1596                 5'-A CCA GA GAA TG CCA TC\A CGC^GTCCAGATCCCGGTCTTTTC-3'
4797-4836                 5'-TGCTCGA TA TGGA CA CCCCC^GGCGGGATGG|TGGCGGGGGC-3'
5009-4970                 5'-AA TTA CTGTGA GCCA TCA TGl ACGCCIGATGGIaGCCTGTCCG-3'
9581-9620                 5'-CAGTGGTA TGA CCA TCA CCGItG AACGGCGTTGCTGCAGGC-3'
9894-9855                 5'- TCGCCA CCA GAA A CGCGCCGlGTTCT|GATGG CGTCTTCC AC-3'
11833-11872               5'-TCCTGCAGGCGGATTACAAClACGCT\GATGGCGGCGGCGAA-3'
12443-12404               5'-CTTCAGGCCTG CCATC C4Cr4/TCCCGCGAAGCTGGTCTTCA-3'
39312-39351               5'-AGACTATCGCA CCATC A GCCi AG AAAACCGAATTTTGCTGG-3'
39627-39588               5'-GTCAAAGTTAA CCA rC]rG7GiCGGCGATGTTTTTCATAGAT-3'




Applicants trust that the comments provided above will serve to clarify for the Examiner 
the arguments set forth in their prior response and that the subject arguments, taken together with 
the claim amendments described herein, will lead to the reconsideration and withdrawal of the 
'non-enablement' rejection of claims 5 and 13 under 35 U.S.C. §112, first paragraph in order that
this application may proceed to issuance. 

[Replacement figure: the recognition sequence 5'-CCATC-3' / 3'-GGTAG-5', with the cleavage site and the junction site marked on both strands, shown for the first position (1325-1364), second position (1596-1635), third position (4797-4836) and fourth position (4970-5009) in lambda DNA.]

ABSTRACT

The TspDTI restriction endonuclease, which shows
a novel recognition specificity 5'-ATGAA(N11/9)-3',
was isolated from Thermus sp. DT. TspDTI appears
to be a 'twin' of restriction endonuclease TspGWI
from Thermus sp. GW, as we have previously
reported. TspGWI was isolated from the same location
as TspDTI, it recognizes a related sequence
5'-ACGGA(N11/9)-3' and has conserved cleavage
positions. Both enzymes resemble two other class-IIS
endonucleases from Thermus sp.: TaqII and
Tth111II. N-terminal amino acid sequences of
TspGWI tryptic peptides exhibit 88.9-100% similarity
to the TaqII sequence. All four enzymes were
purified to homogeneity; their polypeptide sizes
(114.5-122 kDa) make them the largest class-IIS
restriction endonucleases known to date. The existence
of a Thermus sp. sub-family of class-IIS restriction
endonucleases of a common origin is herein
proposed.

INTRODUCTION 

Restriction endonucleases are traditionally divided into three 
major classes or types: I, II and III, with the vast majority of 
endonucleases included in class-II (1,2). 

Class-I, exemplified by EcoK and EcoB, consists of
multimeric enzymes, each composed of three distinct types
of subunits: S, for recognition; R, for cleavage; and M, for
methylation. The endonuclease complex contains all three
subunits, while the methylation complex contains only S and
M subunits. The endonuclease recognizes a 6-8 bp asymmetric,
interrupted cognate sequence, whereas for cleavage it requires
Mg2+, S-adenosylmethionine (SAM) and ATP. SAM and ATP
are utilized both as cofactors and allosteric effectors, which
determine, based on methylation of substrate DNA, whether
methylation or restriction activity will be turned on. During
this complicated sequence of events, including ATP-driven
translocation of the DNA helix, scission takes place at random and
far from the recognition site, up to several thousand base
pairs away (3,4).

Class-II was originally distinguished as featuring homodimeric
or homomultimeric endonucleases, such as EcoRI,
requiring Mg2+ as the only obligatory cofactor. They
recognize 4-8 bp palindromic sequences and cleave within
this sequence (1). Cognate methyltransferases are monomeric
proteins, which require only SAM for activity.

In class-III, exemplified by EcoPI, an endonuclease contains
two different subunits or a single subunit, and recognizes
asymmetric sequences of 6 bp, cleaving outside at a distance
of ~25 bp. The enzymes require Mg2+ as a cofactor and ATP as
an allosteric activator. In addition, they are stimulated by
SAM (4).

Currently, with increasing numbers of new endonucleases 
found — over 250 specificities and several thousands of 
isoschizomers (2), traditional classification is no longer 
adequate to cover the diversity of restriction endonucleases 
known. Class-II turned out to be very heterogeneous, with 
numerous enzymes not fitting the original classification. 
Several new endonuclease categories are distinguishable, 
either temporarily described as sub-classes within class-II or 
as proposed separate types. In particular, (sub)class-IIS 
significantly differs from the class-II paradigm by recognizing 
4-7 bp asymmetric sequences, cleavage at the defined distance 
of 0-20 bp downstream (5), monomeric architecture (6-8) and 
the utilization of different mechanism of recognition and 
scission (9,10). Some class-IIS endonucleases, in addition to 
their requirement for Mg2+, are stimulated by SAM, although
this is not an obligatory cofactor. This unusual mode of 
interaction with DNA prompted detailed function-structure 
studies. In particular, FokI (11) has become a model protein
for endonuclease-DNA interaction studies. FokI and the other
class-IIS endonucleases characterized so far are monomers in
solution (6,12,13), although transient dimerization during
cleavage has been observed (9,10). FokI and StsI both have
two functional domains: one for binding to the cognate site
and another for DNA cleavage (5,7-10,14-16). Such
modular enzyme architecture allows for remarkable protein 
engineering experiments, such as the design and construction 
of artificial chimeric restriction endonucleases, composed of 
the FokI endonuclease cleavage domain fused with a site-specific
DNA binding protein. Examples of this include Ubx
homeodomain of Drosophila (17) and zinc-finger transcription
factors fused to the C-terminal domain of FokI (18). Such
hybrid proteins cleaved DNA at novel sites, with target 
recognition specificities imposed by the DNA-binding protein 
fusion partners. Separation of a recognition sequence from its 
cleavage site led to the development of a universal restriction 
endonuclease, capable of cleaving a single-stranded DNA 
target at a predetermined sequence (14,19). 

Proposed class-IV is exemplified by Eco57I. The enzyme is 
composed of just a single polypeptide, which is a fusion of 
endonuclease and methyl transferase moieties. Monomeric 
enzyme recognizes an asymmetric sequence of 6 bp, but 
cleaves 16/14 bp downstream from its cognate site. It requires 
Mg2+ and is heavily dependent on SAM (20,21).

Type-BcgI-like contains unusual enzymes, which form an
asymmetric protein complex of three subunits, of which two
are identical. Such complexes recognize 5-7 bp continuous or
interrupted asymmetric sites and can perform both cleavage
and methylation. Methylation requires SAM and is stimulated
by Mg2+. These endonucleases cleave both upstream and
downstream of the cognate site [(10/12)CGANNNNNNTGC(12/10)
in the case of BcgI], excising the recognition site along with
flanking sequences from the target DNA. Cleavage requires
Mg2+ and is stimulated by SAM (22,23).

Type-IIG partially overlaps (sub)class-IIS, class-IV and
BcgI-like enzymes in that they bind to asymmetric sequences
and cleave on one or both sides of their recognition sites. In
addition, they are invariably stimulated by SAM, and both
methyl transferase and restriction activity are located in the
same polypeptide. This type is exemplified by HaeIV, which is
specific to the (7/13)GAYNNNNNRTC(14/9) sequence and forms a
homodimer (24), as opposed to monomeric class-IIS enzymes
(6,12,13).

Type-CviJI-like contains endonucleases of eukaryotic 
origin, found only in viruses infecting unicellular Chlorella 
algae. The enzymes resemble class-II in their homodimeric 
structure and Mg2+ requirement as the only cofactor. However,
they have features not found in any other class of restriction 
endonuclease: recognizing more frequent sequences than their 
prokaryotic counterparts — degenerated 4 bp (statistically 
equivalent to 3 bp), and their specificity changes in the 
presence of an adenine nucleotide derivative to cleave even 
more frequently — essentially a 2 bp sequence (25-27). 

Type-BfiI-like shares characteristics of class-IIS enzymes,
except it does not require Mg2+ for cleavage, which is a radical
exception amongst restriction endonucleases of all types, and 
involves a different mechanism of DNA scission (28). 

There are more variations within these groups or putative 
classes, and even evolutionary traces of other functions 
preceding restriction of DNA can be found, such as a ligase 
moiety present in NaeI. It has been shown that just a single
mutation converted NaeI to topoisomerase/recombinase (29).

In this paper we present evidence for the existence of a 
Thermus sp. sub-family of related class-IIS restriction 
endonucleases, which have a combination of exclusive
features in addition to those found in class-IIS and class-IV. 



MATERIALS AND METHODS 

Bacterial strains, plasmids, media and reagents 

Thermus sp. DT was from the EURx Ltd bacterial strain
collection, isolated during the company research program.
The bacterium is an obligatory thermophile, which grows
between 56 and 75°C. The optimum cultivation conditions
were at 60°C in a modified Luria broth (0.5% tryptose, 0.3%
yeast extract, 0.2% NaCl, pH 7.2, Nitsch's trace elements)
(30). Escherichia coli DH11S {mcrA Δ[mrr-hsdRMS(rK-,
mK+)-mcrBC] Δ(lac-proAB) Δ(recA1398) deoR, rpsL,
srl, thi/F' proAB+ lacIq-ZΔM15} (Life Technologies,
Gaithersburg, MD) was used for transformation of ligation
mixtures and DNA propagation.

Difco media components were from Becton-Dickinson 
(Franklin Lakes, NJ). Agarose GTG was from FMC 
(Rockland, MA). Phosphocellulose P11 resin was from
Whatman (Springfield Mill, UK). Hydroxyapatite HTP was
from Bio-Rad (Hercules, CA). Other chromatographic resins 
were from Pharmacia Biotech AB (Uppsala, Sweden). 
Immobilized TPCK-trypsin was from Pierce (Rockford, IL). 
All other reagents were from Amresco (Solon, OH) or Sigma- 
Aldrich (St Louis, MO), of the highest available purity. 

Cloning vector pTZ18U (ApR, MCS, f1 ori, T7 promoter)
(31) was obtained from Dr David Mead, Molecular Biology
Inc., WI. Plasmids pBR322, pUC19 and pACYC184, miniprep
DNA purification kit, SmaI endonuclease, T4 DNA
ligase, T4 DNA polymerase and lambda DNA were from
EURx Ltd (Gdansk, Poland).

TspDTI purification 

The TspDTI restriction endonuclease was isolated using the 
following stages, performed at 4°C. 

Polyethyleneimine (PEI) removal of nucleic acids. Thermus
sp. DT cells were resuspended in buffer A [50 mM Tris-HCl
pH 7.0, 100 mM NaCl, 5 mM EDTA, 10% glycerol, 5 mM
β-mercaptoethanol (βME), 0.1% Triton X-100, 1 mM PMSF
and 20 µg/ml benzamidine] and lysozyme was added to a
concentration of 1 mg/ml. The suspension was incubated for 
1 h and sonicated. Bacterial debris was spun down, the NaCl 
concentration was increased to 200 mM and PEI (pH 7.0) was 
added to 0.4%. The suspension was stirred for 1 h at 4°C and 
the nucleic acid-PEI complex was removed by centrifugation. 

Ammonium sulphate (AmS) fractionation. The PEI supernatant 
was adjusted to 30% AmS saturation and stirred for 2 h. 
Precipitated contaminating proteins were removed by centrifugation
and AmS was added to the supernatant to increase
its concentration to 60% saturation. The protein fraction from 
the 30-60% AmS precipitation was spun down and the 
supernatant was discarded. 

Phosphocellulose chromatography. Pelleted TspDTI was
dissolved in buffer B (20 mM KPO4 pH 7.0, 30 mM NaCl,
0.5 mM EDTA, 5% glycerol, 5 mM βME, 1 mM PMSF,
20 µg/ml benzamidine), dialysed against buffer B and
adsorbed onto phosphocellulose P11. The column was washed
with buffer B and eluted with a gradient of 30 mM to 1 M
NaCl in buffer B.

Heparin-agarose chromatography. Fractions containing 
restriction activity were dialysed against buffer C (20 mM 
Tris-HCl pH 7.5, 30 mM NaCl, 0.5 mM EDTA, 5% glycerol, 
5 mM βME) and applied to a Heparin-agarose column. The
column was washed with buffer C and TspDTI was eluted 
with a 30 mM to 1 M NaCl gradient in buffer C. 

DEAE-Sephadex chromatography. TspDTI was dialysed 
against buffer D (20 mM Tris-HCl pH 7.5, 70 mM NaCl, 
0.5 mM EDTA, 5% glycerol, 5 mM βME), applied to a
DEAE-Sephadex column, which was eluted with buffer D. 

Molecular sieving on Sephadex G-120. Active fractions from 
DEAE-Sephadex were concentrated to 3 ml and subjected to 
molecular sieving on a Sephadex G-120 column, equilibrated 
in buffer E [20 mM Tris-HCl pH 8.3, 3 mM MgCl2, 25 mM
(NH4)2SO4, 25 mM KCl, 0.5 mM DTT, 5% glycerol].

Hydroxyapatite chromatography. The enzyme was further
dialysed against buffer F (20 mM KPO4 pH 7.0, 30 mM NaCl,
0.1 mM EDTA, 5% glycerol, 5 mM βME), adsorbed to a
Hydroxyapatite HTP column, washed with buffer F and eluted
with a 20-900 mM KPO4 pH 7.0 gradient in buffer F. Purified,
homogeneous TspDTI (Fig. 2) was dialysed against buffer G
[20 mM Tris-HCl pH 7.5, 3 mM MgCl2, 25 mM (NH4)2SO4,
2 mM DTT, 50% glycerol] and stored at -20°C.

Determination of TspDTI recognition and cleavage sites 

The TspDTI recognition site and cleavage positions were 
established by shotgun cloning and sequencing of the partial 
digestion products of bacteriophage lambda DNA. The 
TspDTI-generated restriction fragment ends were blunted 
with T4 DNA polymerase in the presence of dNTPs (30), 
cloned into the SmaI site of a modified pTZ18u vector (31),
transformed into E.coli DH11S and plated onto X-Gal/IPTG
plates (30). Miniprep plasmid DNA was isolated from white 
colonies and the fragment-vector junctions were sequenced 
using the ABI Prism 310 automated sequencer with ABI Prism 
BigDye Terminator Cycle Sequencing Ready Reaction Kit 
(Perkin Elmer Applied Biosystems, Foster City, CA). The 
obtained sequence data were then analysed using ABI 
Chromas 1.45 software (Perkin Elmer Applied Biosystems) 
and DNASIS 2.5 software (Hitachi Software, San Bruno, CA). 

Proteolysis of TspGWI and amino acid sequence 
determination 

Purified TspGWI was subjected to limited TPCK-trypsin
digestion to obtain internal polypeptides for N-terminal amino
acid sequencing. Proteolysis of 30 µg TspGWI was conducted
in 110 µl of buffer T (20 mM Tris-HCl pH 8.3, 25 mM KCl,
3 mM MgCl2, 5% glycerol, 0.05% Tween 20, 0.5 mM DTT)
with 30 µl gel-immobilized TPCK-trypsin and gentle shaking
at 24°C for 3 h. The immobilized TPCK-trypsin was removed
by centrifugation. The supernatant, containing TspGWI
fragments, was run on a 6% SDS/PAGE denaturing gel and
electroblotted onto PVDF membrane in 100 mM CAPS-NaOH
buffer pH 11.0. The N-terminal amino acid sequence
analysis of blotted polypeptides was performed on a gas-phase
sequencer (Model 491, Perkin Elmer-Applied Biosystems).
The phenylthiohydantoin derivatives were analysed by online
gradient high performance liquid chromatography on
Microgradient Delivery System Model 140C equipped with
Programmable Absorbance Detector Model 785A and Procise
software (Perkin Elmer-Applied Biosystems).

Figure 1. Partial digestion of pUC19 plasmid DNA with TspDTI restriction
endonuclease. (A) TspDTI cleavage of pUC19 DNA, 1.5% agarose/TAE.
Lane 1, 1 kb ladder; lane 2, 100 bp ladder; lane 3, untreated pUC19 DNA;
lane 4, TspDTI-cut pUC19 DNA. (B) TspDTI cleavage of pUC19 DNA,
6% polyacrylamide/TBE. Lane 1, 100 bp ladder; lane 2, untreated pUC19
DNA; lane 3, TspDTI-cut pUC19 DNA. The smallest partial digestion band
of 376 bp is indicated in bold italics with horizontal arrow.



RESULTS AND DISCUSSION 

Purification and properties of Thermus class-IIS 
endonucleases 

TspDTI activity is present in very small quantities in its 
natural host Thermus DT strain, thus the development of an 
extensive isolation procedure was essential. Seven purification 
stages were needed to obtain a homogeneous protein: PEI and 
AmS fractionations, followed by five chromatographic steps. 

The optimum reaction conditions for TspDTI are in 10 mM 
Tris-HCl pH 8.0 at 25°C, 10 mM MgCl2, 10 mM DTT. The
temperature activity range extends from 42 to 85°C, with 
maximum activity observed at 65-75°C. Under all digestion 
conditions tested, a stable partial cleavage pattern was 
observed (Fig. 1). Spermidine does not affect TspDTI activity, 
while SAM stimulates the enzyme several-fold. In the 
presence of SAM and without Mg2+ the enzyme methylates
its recognition site, which becomes resistant to TspDTI
cleavage upon the subsequent addition of Mg2+ (data not
shown). TspDTI can be inactivated by 20 min incubation at 
89°C. 

Determination of TspDTI recognition and cleavage sites 

The TspDTI cleavage pattern of pUC19, pBR322, pACYC184 
and lambda DNA indicated a high frequency of cleavage. The 
digested plasmid DNAs were run on an agarose gel (Fig. 1) 
and compared with the digestion patterns of known restriction 
endonucleases. The comparison suggested that TspDTI is an 
enzyme with a novel specificity. However, even repeated 
cleavage with concentrated TspDTI preparations failed to 

Table 1. Determination of the TspDTI recognition sequence and cleavage
positions by shotgun cloning and sequencing of TspDTI restriction
fragments



Position 
in lambda 



26-61 



7207- 
7242 



14406- 
14439 



15044- 
15009* 



19775- 
19810 



19910- 
19877* 



22821- 
22856 



22952- 
22987 



23574- 
23539* 



24762- 
24797 



25085- 
25120 



25300* 



26326- 
26291* 



26515- 
26482* 



DNA sequences flanking TspDTI cleavage sites in bacteriophage lambda
DNA.

Bold, non-italics - a terminal portion of TspDTI-cut restriction fragment 

cloned into pTZ18u derivative; 
Regular, italics - not cloned bacteriophage lambda DNA sequence 
adjacent to cloned TspDTI-derived restriction fragment 



5V.TA7T MTG>U A47T7TCCGGrTTAAGGCGTTTCCGT...-3' 



^ , -...7TCACTCAGC>MCCC^CGGTATCA q;TTCAt CCAGC.■.-3 , 



5*-... GAAAG&TGAACTGA 7TGCCCGTCTCCGCTCGCTGGG...-3' 



5 , :..GGTGGCTGGTCTGCCj3GGGGACG ATTCAt AAGTT...-y 



S , '..ACAA ^TGM TTACAGCGCCATCAGGCAGAGTCTCA...-Z' 



5'-...rCCG( 



q]atgM gccgggcgtta< 



iCAGCATGGATGTGGA...-3' 



5\..A7TCAGCGTCCCCG(|7TGTGAAT<trTCA7lACACG.„.3' 



..TTTC faTGAA ATACATTTn^Al 



TTATTATTTGAATC...-3' 



S'-.^TCCA A^ATGA^ GCCATAGGCATTrGTr 



5'-..t^ca 4^ga4 4gtatgttaaJcattggtataaaaaa..-3' 



5^-..^T( ^4TGA^ G>^GCrcrGTGTTTGTCTTCCTGCCTC...-3 , 



r 



S'-.-ACCT GiATGaAa CAA G CA TG TCATCGTAATATGTTCT...-3' 



..ATGCAGA TAAA TG TTAG AAAT AA*^TC^TACTC...-3' 
..'AAAC AWi^AA^ TATTTTTTCT^G 



TGAAAATAATAGACT...-3' 



S-...GCCAGAA TTG TCAGATTTCCACT tTTCATt TTAAT...-3' 



Table 1. Continued 



26605- 
26570* 



33084- 
33117 



33277- 
33242* 



35664- 
35699 



42678- 
42711 



42845. 
42812* 



43176- 
43209 



43336- 
43303* 



44616- 
44651 



44951- 
44984 



44984. 
44949* 



45214- 
45181* 



46867- 
46902 



47135- 
47100* 



5'-... TGAA AaTGAA aGCG rCCTTA4CACCTCATTACTTAG...-3 , 



..AGACGA TCC TGAA TG^TAATAAG C 0 (tTCAT] gGCTG ...-3' 



5'-...TGGC( 



dATGAA AAGA TGTTTCGTi 



GAAGCCGTCGACGC...-3* 



5 , ....7TTA7 £aTGA^ 7TTA7T7T7TGCAGGGGGGCA^ 



5'-...CA4GCCC^CAAGCCGTAAACGC(jnC^CAGAG....3* 



5'-. .. CACGCA TTGCTTG TG^AAT ATTG C GjTTCATt T AAAT...-3* 



,.CCATCGTCA4CG/^C(^CTCATG Q^TTCA^^ CGCGG■..-3■ 



..TCGAATCCA4TCGTAjrCCAGTTT G;TTCAT| CAGGT... 



5V..GTGAG2^^AAGAGGCGGCG^TTACTACCGATTCC...-3' 



..GCCGT4GCCACTGT(jTGTCCTGAA5TC^TAGTA..-3' 



5 t -...TACTA^^TTCAGGACAGACAGJGGCTACGGC7C...-y 



5^..jAACAGGTCATG7T7TTCTGGCAT({nc3GTCTT...-3' 



5''...GTGC( fATGAA TCGTCA TTG T4TTCCC GGATT AACTA...-3' 



5'-... TA TC CATGAA CA TAAAA GA T4TTACTATACCTTTGA....3' 



*Sequence obtained from TspDTI restriction fragments cloned in reverse
complement orientation. Base numbering refers to the conventional
orientation of the lambda genome; bold text, a terminal portion of TspDTI-cut
restriction fragment, T4 DNA polymerase repaired and cloned into pTZ18u
derivative; italic text, not cloned bacteriophage lambda DNA sequence
adjacent to cloned TspDTI-derived restriction fragment; box + horizontal
arrow, TspDTI recognition sequence; vertical arrows, TspDTI cleavage
positions.



yield a complete reaction, resulting in a stable partial cleavage 
pattern, with ~50% of the substrate DNA converted to
complete digestion bands (Fig. 1). Thus, the inability to 
completely cut substrate DNA is either an intrinsic feature of 
the enzyme or a key cofactor was missing in the reaction. 

Comparison of the 35 junction sequences (29 are shown in
Table 1) indicated that TspDTI belongs to the class-IIS
restriction endonucleases, since a putative non-palindromic
recognition site was found in the cloned inserts at a constant
distance from the vector junction. TspDTI is a novel prototype
of restriction specificity. The enzyme recognizes a 5 bp
asymmetric cognate site (boxed) and cleaves DNA downstream,
after nt 11 and 9 (vertical arrows) in the top and the
bottom strand, respectively, yielding 2 nt 3' single-stranded
termini:



5'-ATGAA NNNNNNNNNNN↓NNN-3'
3'-TACTT NNNNNNNNN↑NNNNN-5'
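
As a minimal sketch, and not part of the original analysis, the type and length of the single-stranded terminus follow directly from the two cut offsets. The function name overhang and its parameters are assumptions made here for illustration.

```python
def overhang(top_offset, bottom_offset):
    """Classify the terminus produced by a class-IIS enzyme.

    top_offset / bottom_offset: number of nucleotides downstream of the
    recognition site after which the top and bottom strands are cut.
    Returns a (kind, length) tuple.
    """
    diff = top_offset - bottom_offset
    if diff > 0:
        # Top strand is cut further downstream, so its 3' end protrudes.
        return ("3'", diff)
    if diff < 0:
        # Bottom strand is cut further downstream, so its 5' end protrudes.
        return ("5'", -diff)
    return ("blunt", 0)


# Offsets of 11 and 9, as determined for TspDTI.
print(overhang(11, 9))
```

For offsets of 11 and 9 the sketch reports a 3' overhang of 2 nt, consistent with the termini described above.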



A computer prediction of the cleavage frequency shows that
there are four TspDTI sites within pUC19, 10 within pBR322,
11 within pACYC184 and 176 within lambda DNA. However,
the actual TspDTI digestion pattern exhibits more bands than
expected, due to partial cleavage (Fig. 1).
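A computer prediction of this kind can be sketched in a few lines. The following is a minimal illustration, not the software used by the authors: because the recognition site is asymmetric, both the site and its reverse complement must be searched. The function names and the toy sequence are hypothetical, and the scan treats the sequence as linear, so a site spanning the origin of a circular plasmid would be missed.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str, site: str) -> list[tuple[int, str]]:
    """List (0-based start, strand) for every occurrence of an asymmetric site.

    '+' hits match `site` on the given strand; '-' hits match its reverse
    complement, i.e. the site lies on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for pattern, strand in ((site, "+"), (reverse_complement(site), "-")):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # step by 1 to allow overlaps
    return sorted(hits)


# Toy sequence (hypothetical, not taken from pUC19 or lambda DNA).
toy = "GGATGAACCTTTCATGG"
print(find_sites(toy, "ATGAA"))
```

Applied to a full plasmid or phage sequence, `len(find_sites(sequence, "ATGAA"))` would give the predicted number of sites.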

Evolutionary implications 

Both the recognition sequence of TspDTI and its cleavage
positions appear to be related to those reported by us
previously for the Thermus sp. restriction endonuclease
TspGWI, 5'-ACGGA(N11/9)-3' (32). The TspDTI cognate
site has only two differences as compared to the 5 bp
recognition site of TspGWI: a transition from C to T and from
G to A in the second and fourth bases, respectively. Moreover,
the cleavage positions at nt 11 and 9 are exactly the same for
both enzymes. Since both TspDTI and TspGWI originate from
two Thermus sp. isolates found in the same hot spring sample,
it is possible that both enzymes occurred as a result of a recent
divergent evolution event. In addition, TspDTI and TspGWI
might be closely related to two previously reported class-IIS
endonucleases found in different Thermus species: TaqII from
Thermus aquaticus (33) and Tth111II from Thermus thermophilus
(34), and more distantly related to mesophilic
endonucleases: EciI from E.coli (REBASE: rebase.neb.com)
and BceAI and BcefI, both from Bacillus cereus (35)
(REBASE: rebase.neb.com) (Table 2). Both TaqII and
Tth111II recognize asymmetric 6 bp redundant sites and
Figure 2. Determination of polypeptide molecular sizes for Thermus class-IIS endonucleases: TspDTI, TspGWI, TaqII and Tth111II. (A) SDS/PAGE of
purified, homogeneous TspDTI, TspGWI, TaqII and Tth111II endonucleases. Lane M1, protein marker broad range (New England Biolabs); lane M2, low
molecular weight marker (Amersham-Pharmacia). Bands marked in lanes M1 and M2 are as follows: 158.2 kDa, MBP-β-galactosidase; 116.4 kDa,
β-galactosidase; 97.2 kDa, phosphorylase b; 66.4 kDa, bovine serum albumin. Lane 1, TspGWI endonuclease; lane 2, TspDTI endonuclease; lane 3, Tth111II
endonuclease; lane 4, TaqII endonuclease. (B) Graph showing estimation of polypeptide sizes for TspDTI, TspGWI, TaqII and Tth111II.



cleave 11 and 9 nt downstream: 5'-GACCGA(N11/9)-3' or
5'-CACCCA(N11/9)-3' (33) and 5'-CAARCA(N11/9)-3' (34),
respectively. Due to the redundancy of the TaqII and Tth111II
6 bp recognition sites, their overall cleavage frequency is only
slightly lower than that of the TspDTI and TspGWI 5 bp
non-redundant sites. One of several possible evolutionary
scenarios would be that TspGWI and TspDTI have eliminated the
first C or G from an ancestral TaqII/Tth111II-like recognition
sequence, thus evolving toward more frequent restriction. As
illustrated in Table 2, the TspGWI and TspDTI 5 bp cognate sites
show 2-3 differences when compared to those of TaqII and
Tth111II. All these changes are located within a variable 3 bp
5'-TGA-3' 'core' region (bases 2-4) of the TspDTI recognition
site 5'-ATGAA-3'. The first and the last A residues (bases
1 and 5) of the TspGWI and TspDTI recognition sites remain
conserved amongst all four enzymes. In addition, one of the
two possible variants of Tth111II cognate sites, 5'-CAAGCA-3',
also shares an internal G residue (base 4) with the TspDTI
site (Table 2). Recognition sequence similarities are further
validated by strict conservation of cleavage positions at nt 11
and 9 for TspGWI, TspDTI, TaqII and Tth111II. The proposal
for the existence of a class-IIS sub-family, within Thermus sp.,
is further reinforced by the following findings.

(i) All four putative related endonucleases have
been purified to homogeneity from their native Thermus
strains (A.Zylicz-Stachula, I.Sobolewski and P.M.Skowron,
manuscript in preparation; S.M.Rutkowska, I.Jaworowska,
I.Sobolewski and P.M.Skowron, manuscript in preparation).
Strikingly, their respective polypeptides migrate to the same
position on SDS/PAGE gels. Only prolonged electrophoresis
allows for distinguishing subtle variations in their molecular
sizes (kDa): TspDTI, 114.5 ± 7; TspGWI, 122.0 ± 7; TaqII,
118.5 ± 7; and Tth111II, 116.5 ± 7 (Fig. 2). Such large
polypeptides are rare amongst prokaryotes, and to our
knowledge they are the largest class-IIS restriction
endonucleases known to date.



(ii) TspDTI-containing fractions eluted from the Sephadex
G-120 column, used for the enzyme purification, show a
homogeneous protein band, with a relative molecular weight
of ~120 ± 10 kDa on SDS/PAGE denaturing gels. Relative
band intensities in consecutive column fractions correlate
perfectly with the restriction activity peak of TspDTI (data
not shown). Subsequent comparison of the elution profiles of
TspDTI, TspGWI, TaqII and Tth111II from a molecular
sieving column shows that all the peaks of activity appear at
nearly identical positions, characteristic of large proteins with a
native molecular size of 110-130 kDa (Table 2) (S.M.
Rutkowska, I.Jaworowska, I.Sobolewski and P.M.Skowron,
manuscript in preparation). This indicates the same,
monomeric structure for TspDTI, TspGWI, TaqII and Tth111II.

(iii) TaqII restriction endonuclease has been cloned,
expressed and purified in our laboratory (36). The recombinant
TaqII exhibits the same molecular size as the native TaqII,
which matches sequencing/genetic analysis data obtained for
the taqIIR gene: 3315 bp/1105 aa/125.6 kDa (36; S.M.
Rutkowska, I.Jaworowska, I.Sobolewski and P.M.Skowron,
manuscript in preparation).

(iv) TspGWI has been subjected to partial trypsin digestion
and the N-terminal amino acid sequences for two internal
peptides have been determined. Comparison between the two
TspGWI tryptic fragments and the complete TaqII
endonuclease amino acid sequence revealed near perfect
homology: in peptide 1, an 8 aa continuous region contains seven
identical amino acids and a single conservative substitution
(100% similarity), while in peptide 2, a 9 aa region contains
eight identical amino acids (88.9% similarity) (Fig. 3)
(36; S.M.Rutkowska, I.Jaworowska, I.Sobolewski and
P.M.Skowron, manuscript in preparation).

(v) Both TspDTI and TaqII are capable of specific
methylation of their recognition sites in the presence of SAM
(Table 2) (A.Zylicz-Stachula, I.Sobolewski and P.M.Skowron,
manuscript in preparation). Some data suggest that TspGWI
Table 2. Comparison between TspDTI/TspGWI endonuclease 'twins' and class-IIS restriction endonucleases with related recognition and cleavage sites

Restriction      Bacterial host a      Recognition  Cleavage     Reaction       Polypeptide size b  Native molecular  Specific DNA  Reference
endonuclease a                         site a       positions a  temperature a                      size c            methylation

TspDTI           Thermus sp.           ATGAA        N11/9        70°C           114.5 kDa           110-130 kDa       +++           This work
TspGWI           Thermus sp.           ACGGA        N11/9        70°C           122.0 kDa           110-130 kDa       +/-           (32), This work
TaqII            Thermus aquaticus     GACCGA       N11/9        70°C           118.5 kDa           110-130 kDa       ++++          (33), (36), This work
                                       CACCCA       N11/9                       (125.6 kDa d)
Tth111II         Thermus thermophilus  CAARCA       N11/9        70°C           116.5 kDa           110-130 kDa       ND            (34), This work
EciI             Escherichia coli      GGCGGA       N11/9        37°C           ND                  ND                ND            REBASE
BcefI            Bacillus cereus       ACGGC        N12/13       30°C           ND                  ND                ND            (35)
BceAI            Bacillus cereus       ACGGC        N12/14       30°C           ND                  ND                ND            REBASE

a Thermus sp.-derived and TspDTI/TspGWI-related endonucleases, bases in a recognition sequence, cleavage positions and reaction temperatures are marked
in bold.

b As estimated by SDS/PAGE of homogeneous proteins.

c As estimated by molecular sieving under native buffer conditions.

d As calculated from sequencing/genetic analysis data obtained for the taqIIR gene.

ND, not determined.



has residual methylation activity (data not shown). Analysis of
the cloned taqIIR gene (36; S.M.Rutkowska, I.Jaworowska,
I.Sobolewski and P.M.Skowron, manuscript in preparation)
reveals that it is a fusion containing both an endonuclease
moiety and a methyltransferase module with FGG and DPPY
motifs (37).

Taken together, the near identity of very atypical molecular
sizes, substantial similarities in recognition sequences,
identity in cleavage sites, strong partial amino acid sequence
homology between TspGWI and TaqII, cleavage stimulation
by SAM, the presence of all four endonucleases in the same
bacterial genus, as well as biochemical similarities, point to the
common evolutionary origin of TspDTI, TspGWI, TaqII and
Tth111II, thus defining a sub-family of Thermus class-IIS
enzymes. These endonucleases are characterized by a unique
combination of features found only in the Thermus class-IIS
sub-family (such as extremely large polypeptides), as well as
those present in class-IIS (asymmetric cognate sequence,
cleavage outside recognition site) and sub-class IV (SAM
stimulation, endonuclease-methyltransferase gene fusion).
Moreover, considering SAM dependence, the Thermus class-IIS
sub-family enzymes show continuity between class-IIS
and class-IV features, from barely detectable (TspGWI) to
strong stimulation (TaqII) (data not shown).

Three more class-IIS endonucleases from unrelated
bacterial species exhibit marked similarities to the
putative Thermus sp. class-IIS endonuclease family. The
mesophilic EciI endonuclease, from E.coli, recognizes
5'-GGCGGA(N11/9)-3' (REBASE: rebase.neb.com). The
enzyme shares the last 4 bp out of the 5 bp of the TspGWI
recognition site (bases 2-5) and has the same cleavage
positions of N11/9. Moreover, BcefI and BceAI, mesophilic
endonucleases isolated from B.cereus, both recognize the
5'-ACGGC-3' cognate site. Their recognition sequence differs
in only one bp (last base, 5) from the TspGWI recognition
sequence. However, BcefI and BceAI have cleavage positions
shifted further downstream: 1 nt in the top strand and 4 or 5 nt
in the bottom strand to N12/13 or N12/14, respectively (35;
REBASE: rebase.neb.com).
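The site and cleavage comparisons made in this paragraph are simple positional arithmetic and can be checked with a short sketch. The values in `SITES` are copied from Table 2; the dictionary and the function names `site_differences` and `cut_shift` are hypothetical helpers introduced only for illustration.

```python
# Recognition site, top-strand cut and bottom-strand cut (nt downstream),
# as listed in Table 2.
SITES = {
    "TspGWI": ("ACGGA", 11, 9),
    "TspDTI": ("ATGAA", 11, 9),
    "EciI": ("GGCGGA", 11, 9),
    "BcefI": ("ACGGC", 12, 13),
    "BceAI": ("ACGGC", 12, 14),
}


def site_differences(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sites differ."""
    if len(a) != len(b):
        raise ValueError("sites must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def cut_shift(enzyme: str, reference: str = "TspGWI") -> tuple[int, int]:
    """Return (top, bottom) cleavage shift in nt relative to the reference."""
    _, top, bottom = SITES[enzyme]
    _, ref_top, ref_bottom = SITES[reference]
    return top - ref_top, bottom - ref_bottom


# TspDTI versus TspGWI: differences at bases 2 and 4.
print(site_differences(SITES["TspDTI"][0], SITES["TspGWI"][0]))
# EciI: compare the last 5 bp of its 6 bp site with the TspGWI site.
print(site_differences(SITES["EciI"][0][1:], SITES["TspGWI"][0]))
# BcefI and BceAI: a single difference at base 5, with cuts shifted downstream.
print(site_differences(SITES["BcefI"][0], SITES["TspGWI"][0]))
print(cut_shift("BcefI"), cut_shift("BceAI"))
```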

In general, homologies between restriction endonucleases
are very rare. They are usually limited to the epitopes, such as
the highly variable catalytic motif PDX10-30(D/E)XK (38,39).
Amongst class-IIS endonucleases, both primary sequence and



          30    37     445   453
          |      |     |       |
TaqII     PEAQLVPL     FFYERLAEE
          |||||:||     ||||||| |
TspGWI    PEAQLIPL     FFYERLAQE

Figure 3. Amino acid sequence comparison between two tryptic fragments
of TspGWI endonuclease and the complete TaqII endonuclease sequence.
Identical amino acid residues are indicated by straight lines, and similar
residues by dots. The partial amino acid sequence of TspGWI restriction
endonuclease was obtained by limited trypsin digestion, followed by
N-terminal protein sequencing of two internal peptides. The TaqII amino
acid sequence was translated from the cloned taqIIR coding gene (36;
S.M.Rutkowska, I.Jaworowska, I.Sobolewski and P.M.Skowron, manuscript
in preparation). The first homologous region extends from amino acid 30 to
37 and the second region from amino acid 445 to 453 of TaqII restriction
endonuclease.
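The identity and similarity figures quoted for the two peptides in Figure 3 can be reproduced with a small sketch. This is only an illustration: the peptide strings are read from Figure 3, while the grouping of I, L, V and M as mutually conservative residues is an assumption made here and is not a scoring scheme stated in the text.

```python
# Assumed conservative group (aliphatic residues); a hypothetical choice.
ALIPHATIC = frozenset("ILVM")


def compare_peptides(a: str, b: str, conservative=ALIPHATIC):
    """Return (identical, similar, percent similarity) for aligned peptides.

    'similar' counts mismatched positions where both residues fall in the
    assumed conservative group; percent similarity includes both classes.
    """
    if len(a) != len(b):
        raise ValueError("peptides must be aligned to the same length")
    identical = sum(x == y for x, y in zip(a, b))
    similar = sum(
        x != y and x in conservative and y in conservative
        for x, y in zip(a, b)
    )
    percent = round(100 * (identical + similar) / len(a), 1)
    return identical, similar, percent


# Peptide 1 (TaqII residues 30-37) and peptide 2 (TaqII residues 445-453).
print(compare_peptides("PEAQLVPL", "PEAQLIPL"))
print(compare_peptides("FFYERLAEE", "FFYERLAQE"))
```

Under this assumption the V/I mismatch in peptide 1 counts as conservative, whereas the E/Q mismatch in peptide 2 does not.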

structural homology were reported and studied in detail for the
imperfect isoschizomers FokI and StsI (15,16). Both enzymes
bind to the 5 bp asymmetric site 5'-GGATG-3' and cleave 9/13
(FokI) or 10/14 (StsI) nt downstream. Remarkably, even
though the enzymes exhibit relatively high amino acid
sequence homology (30%), they are very distinct biochemically:
StsI is an acidic protein (pI 6.3), while FokI is very basic
(pI 9.4), they do not cross-react immunologically and they
have different reaction optima. Nevertheless, they share a
common domain organization, suggesting very close
similarities in their mechanism of action (5,7-10,15-18). In
contrast, two other neoschizomers belonging to class-II,
SmaI and XmaI, show no homology between their amino
acid sequences. Both recognize 5'-CCCGGG-3' sites; however,
they leave blunt or 4 nt sticky ends, respectively.
Apparently, the mechanism of recognition is different, as they
bend the DNA helix in opposite orientations (40). The very few
other known cases of homologous endonucleases in class-II
are limited to isoschizomers; they are listed below.

(i) EcoRI/RsrI/MunI share 18-50% homologous amino
acids and common active site architecture with XcyI and Cfr9I
(39,41); (ii) XmaI/XcyI/Cfr9I are highly homologous,
exhibiting 80% homology between Cfr9I and XmaI/XcyI (41);
(iii) the AvaI/BsoBI pair: AvaI from the cyanobacterium Anabaena
variabilis has a thermophilic counterpart from a distant
species, Bacillus stearothermophilus. Nevertheless, the
enzymes show 55% homology and possess common amino
acid residues critical for catalytic activities (42,43); (iv) BsuRI
from prokaryotic Bacillus sphaericus and CviJI from IL-3A
virus-infected eukaryotic Chlorella, in spite of the fact that
they originate from two separate kingdoms, still exhibit 11.6%
homology. Moreover, they show a substantial epitopic
similarity (35%) over a 132 amino acid region (25-27,44);
(v) the TaqI isoschizomer group contains eight related
isoschizomers, with homology ranging from 54 to 100%, which was
correlated with the geographical location of their Thermus
host strains (45). Moreover, differences in their amino acid
sequences allowed for an insight into the observed variation in
thermostability (45,46). The corresponding
methyltransferases are remarkably similar, indicating that both the
endonuclease and the methyltransferase components of
TaqI-related restriction-modification systems (RMs) were
evolving as linked genes, in contrast to the EcoRI and RsrI
systems (47); (vi) BsuFI and MspI share 45% overall amino
acid sequence similarity, including three smaller regions with
60% identity. Interestingly, in spite of their close enzyme
relatedness, the mspIRM genes have a divergent arrangement,
while the bsuFIRM genes have a convergent organization (48);
(vii) EcoHK31I and EaeI share 92% identity; their
corresponding methyltransferases are both composed of two
homologous subunits, α and β, regulated by the same alternative
open reading frame mechanism. Moreover, some evidence
shows that the EcoHK31I and EaeI RMs were subjected to
intergenic transfer (49); (viii) BsuBI and PstI share 46% amino
acid identity. The RMs have different genetic organization:
the pstIRM genes are transcribed divergently, while bsuBIRM are
arranged in head-to-tail orientation, with bsuBIM preceding
bsuBIR (50).

According to the above examples, both primary sequences 
of endonuclease and methyltransferase coding genes, within 
related RMs, as well as the gene organization, are subjected to 
intense evolutionary pressure. This results in a high evolu- 
tionary rate and variability, even amongst related RMs, where 
methyltransferase can evolve either separately or together 
with an endonuclease component, or even be horizontally 
transferred amongst different bacterial species (49). 

Whether the homology of recognition and cleavage sites of
TspDTI, TspGWI, TaqII and Tth111II (possibly including
EciI, BcefI and BceAI as well) is mirrored by the similarity of
biochemical properties, homology and organization of their
coding genes and amino acid sequences, remains to be further
evaluated.
ABSTRACT

A novel prototype class-IIS restriction endonuclease,
TspGWI, was isolated from the thermophilic bacterium
Thermus sp. GW. The recognition sequence and
cleavage positions have been established: TspGWI
recognizes the non-palindromic 5-bp sequence
5'-ACGGA-3' and cleaves the DNA 11 and 9 nt downstream
in the top and bottom strand, respectively. In
addition, an accompanying endonuclease, TspGWII, an
isoschizomer of PstI, was found in Thermus sp. GW
cells.

RESULTS AND DISCUSSION 

The general characteristics of type II restriction endonucleases
are that they require Mg2+ ions as the only obligatory co-factor
and recognize a palindromic 4-8-bp cognate site, where
cleavage takes place at precisely defined positions within the
recognition site (1). The type II enzymes are highly
heterogeneous, with a distinct subclass, IIS, which differs from the
class-II paradigm by its ability to recognize asymmetric
sequences and cleave further downstream at defined distances
of 1-20 nt (2). Currently there are 73 known prototypes of
endonucleases that recognize asymmetric sites (1). However,
only 43 of them meet the criteria of class-IIS. The remaining
30 enzymes either (i) cleave within asymmetric sites, (ii) have
not yet had their cleavage positions determined or (iii) cleave
on both sides outside their recognition sites (1).

Thermus sp. GW was cultivated aerobically at 60°C in a
modified Luria broth (0.5% tryptose, 0.3% yeast extract, 0.2%
NaCl, pH 7.2) supplemented with trace elements. Since class-IIS
restriction endonuclease TspGWI activity is only present in
minute quantities in the host strain, an extensive purification
was required to obtain a sufficient amount of enzyme for
further analysis. The purification procedure included (i) 0.4%
polyethyleneimine removal of nucleic acids from the crude
extract in the presence of 100 mM NaCl, (ii) ammonium sulfate
30-50% saturation cut, (iii) phosphocellulose chromatography,
(iv) heparin-agarose chromatography, (v) Cibacron blue-agarose
chromatography and (vi) DEAE-Sephadex chromatography.
The purified preparation was free from non-specific nucleolytic
activities. In the course of the purification of TspGWI an
accompanying restriction endonuclease, TspGWII, a thermophilic
isoschizomer of PstI, was found. TspGWII recognizes
5'-CTGCAG-3' and cleaves after the A residue, leaving 4 nt 3'-protruding
ends (not shown).

The TspGWI recognition site was established using two
procedures: (i) assessment of the digestion pattern on pUC19,
pACYC184 and pBR322 DNAs and (ii) cloning and
sequencing of TspGWI restriction fragments of pBR322 and
bacteriophage lambda DNAs (Table 1). The analysis of the
cleavage pattern and the mapping of TspGWI sites present in
pUC19 (Fig. 1A and B) suggested that this recognition site is
5'-ACGGA-3'. There are two such sites in pUC19, four in
pACYC184 and five in pBR322. Cleavage of pUC19 with
TspGWI resulted in a complete digestion. However, cleavage
of pBR322 and pACYC184 DNAs showed a stable partial
digestion pattern, where all sites except one are efficiently cleaved
(Fig. 1C-E). The refractory sites, cleaved at a substantially slower
rate, were mapped and turned out to be located within the
tetracycline resistance gene, present in both plasmids. One
possible explanation for the refractory cleavage
is that it is an effect of the immediate sequence segment
surrounding the TspGWI site. The neighboring bases, which
are present on both sides of the TspGWI recognition
sequences in pBR322 and pACYC184, are shown in bold:
5'-AACGGAT-3'. The putative TspGWI cognate site was
further investigated through the digestion of pBR322 DNA and
lambda DNA with TspGWI, followed by repair of the TspGWI
restriction fragment termini with T4 DNA polymerase/dNTPs
(3) and by cloning the resulting restriction fragments into the
SmaI site of a modified pTZ18u vector (4). The vector-insert
junctions of the resulting clones were sequenced using the ABI
Prism 310 automated sequencer with the ABI Prism BigDye
Terminator Cycle Sequencing Ready Reaction kit and then
analyzed using ABI Sequencing Analysis 3.4.1 software
(Applied Biosystems, Foster City, CA) and Hitachi DNASIS
2.5 software (Hitachi Software Engineering Co., Yokohama,
Japan). A total of 58 junction sequences were compared.
Twenty-five junctions derived from bacteriophage lambda
cloned restriction fragments are exemplified in Table 1, where
eight junctions are for the top strand of the 5'-ACGGA-3' and 17
for the bottom strand of the 5'-TCCGT-3' reverse complement
Table 1. Determination of the TspGWI recognition sequence and cleavage positions by shot-gun cloning and sequencing of TspGWI restriction fragments



Sequence position in lambda genome.


DNA sequence flanking TspGWI cleavage site in bacteriophage lambda DNA.


152-185 


5'- ...TGTCGTTTCCTTTCTjCTG II 1 1 IGlTCCGTl GGAAT...-3' 


1064-1097 


$\..CGTTGAGCCGACTA 7TCGTGATATI TCCGT] CGCTG...-3' 


2189-2224 


5V..CG7C7|ACGGA|AAGCCGG7GGCCAGCATGCCACGTAA...-3' 


2189-2222* 


5'-..^CGrGGC^rGCrGG^CACCGGCTT|TCCGTlAGACG...-3' 


3869-3904 


5'-..AGGATbCGGA\TAACGGCTAckcGTGTTTGAGCAGT...~y 


4292-4325 


5'- ... A7C4GGAAA7T7TTG^CCAGCAGGlTCCGTlGAAAC...-3^ 


4615-4650 


5'-. . . TGA TG V\CGGA\ CCA CGACAGG CCCGCAGTTATCAGGT. . .-3' 


10779-10814* 


5'-. ,.CAGCC[ACGGA\ CTTTGCCCGCC^TGCAAGCTGCGTGGC.. .-3' 


10781-10814 


5'-. CACGCAGCTTGCAGG^GGGCAAAG\1CCG1\ GGCTG...-3* 


12562-12597*


5'-. . A CA GC \ACGGA\ ACGGG TGAA GCTGCGCC AGTTCTGCT. . . -3' 


16454-16489 


5'- ...GCGGC $C~G~Ba1 GCCGCGC/A TCi^JCTGTAATGCGTACC.. .-3* 


17205-17240* 


5*- ...TTCTC\ACGGA\ TACrCACGCAG^CGGAACAGTCGCTGG...-3' 


19472-19505 


5'-. ..GGTA TACA GA TTAA TjCCGGCAGCG [TCCGTl CGTTG ... -3* 


19546-19579* 


5^...rTTArA4CCG>^CCCC^ACGATGAA|TCCGTlCAGTA...-3 , 


19924-19958 


5' -...CCA TG [4 CGGAl GGATGA TGCCCGGCCGGAGGTGCTG...3' 


19977-20010 


5 , -...GTGGAAGAGGrGGCG^CGTAACGCGlTCCG7lGGTGG...-3' 


19975-20010* 


5'- ...CCACC[ACGGAl CGCGTT4CGCG^CCACCTCTTCCACCA...-3' 


21994-22029 


5'-. . . CAACC \ACGGA\ CCA TAAAAA tAaTAATCTGCTGGCC. . .-3' 


22319-22352* 


5»-...CGGA TCCGGA4 CAGTjrTTTCTGCTlTCCGTl ATCCT...-3* 


30971-31004 


5^...A4GCCCGCrGCCAGAAAAATGCAT^TCCGTlGGTTG...-3 , 


31291-31324* 


5'-...G7*AGGCGCA4 TCA C TjTTCGTCT AC {TCCGTlTACAA. . .-3* 


32664-32631 * 


5'-. . . CGTTGCCAA CCA G 7"ACGGCCTT AAlTCCGTl GGACG ... -3' 


33986-34019 


5'-. . . CTGTTTAGTTA CGAG^G ACATTGQ TCCGTl GTATT. , .-3' 


34316-34283* 


5 , -...rGrG/^7GGA4CAArACCAGGACTA|7CCG7lATGAC...-3 , 


43565-43598 


5'-... TCTTGCCCA TAAAGC^VGATGAACTlTCCGfl TAATC...-3' 


44551-44518* 


5*-...rArCA7GCCG7TA47^TGTTGCCA[^CGBGGCAA...-3' 



Base numbering refers to the conventional orientation of the lambda genome. Bold, a terminal portion of a TspGWI-cut restriction fragment, T4 DNA polymerase
repaired and cloned into a pTZ18u derivative; italic, uncloned bacteriophage lambda DNA sequence adjacent to the cloned TspGWI-derived restriction fragment; box
with horizontal arrow, TspGWI recognition sequence; vertical arrows, TspGWI cleavage positions.
*Sequence obtained from TspGWI restriction fragments cloned in reverse complement orientation.
Figure 1. Digestion pattern and site preference of TspGWI on pUC19, pBR322 and pACYC184 plasmid DNAs. (A) TspGWI cleavage of pUC19 DNA, 1% agarose/TAE.
Lane 1, 1 kb ladder; lane 2, 100 bp ladder; lane 3, untreated pUC19 DNA; lane 4, TspGWI-cut pUC19 DNA. (B) TspGWI cleavage of pUC19 DNA, 3.5% NuSieve
GTG agarose/TAE. Lane 1, 100 bp ladder; lane 2, untreated pUC19 DNA; lane 3, TspGWI-cut pUC19 DNA. (C) TspGWI cleavage of pBR322 DNA, 1% agarose/TAE. Lane 1,
1 kb ladder; lane 2, 100 bp ladder; lane 3, untreated pBR322 DNA; lane 4, TspGWI-cut pBR322 DNA. Out of five TspGWI sites in pBR322, one 5'-AACGGAT-3'
variant is cleaved inefficiently, resulting in fragments of 1527 bp (faint band, marked in bold) and 282 bp (not visible on reproduced picture, marked in bold and
crossed). The partial digestion band of 1809 bp is indicated by bold italics and a horizontal arrow. (D) TspGWI cleavage of pBR322 DNA, 3.5% NuSieve GTG agarose/TAE.
Lane 1, 100 bp ladder; lane 2, untreated pBR322 DNA; lane 3, TspGWI-cut pBR322 DNA. (E) TspGWI cleavage of pACYC184 DNA, 1% agarose/TAE. Lane 1,
1 kb ladder; lane 2, 100 bp ladder; lane 3, untreated pACYC184 DNA; lane 4, TspGWI-cut pACYC184 DNA. Bands of 2412 and 282 bp, where intensity is
decreased due to the slow cleavage rate of the TspGWI site variant 5'-AACGGAT-3', are marked in bold. The partial digestion band of 2694 bp is marked in bold italics and
a horizontal arrow. Selected band sizes of marker DNAs are shown on the left of each panel.
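The relation between complete and partial digestion bands described in this caption (for example 1527 bp + 282 bp = 1809 bp) follows from how a circular plasmid is cut. A minimal sketch is given below; the function names, the 1000 bp circle and the cut coordinates are hypothetical toy values chosen for illustration, not plasmid map data.

```python
def circular_fragments(length: int, cuts: list[int]) -> list[int]:
    """Fragment sizes from complete digestion of a circular molecule."""
    cuts = sorted(set(cuts))
    if not cuts:
        return [length]  # uncut circle
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(length - cuts[-1] + cuts[0])  # fragment spanning the origin
    return sizes


def partial_fragments(length: int, cuts: list[int], refractory: int) -> list[int]:
    """Fragment sizes when one refractory site is left uncut."""
    return circular_fragments(length, [c for c in cuts if c != refractory])


# Toy circle of 1000 bp with four cut positions (hypothetical values).
toy_cuts = [100, 300, 350, 700]
print(circular_fragments(1000, toy_cuts))       # complete digestion
print(partial_fragments(1000, toy_cuts, 300))   # site at 300 left uncut
```

Leaving one site uncut merges the two fragments that flank it into a single band whose size is their sum, which is the pattern reported for the slowly cleaved 5'-AACGGAT-3' site.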



orientation. Comparison of the junction sequences confirmed the
presence of the 5'-ACGGA-3' site either at defined distances
near the ends of a cloned restriction fragment or in pBR322/
lambda DNA regions which, prior to TspGWI cleavage, were
continuous with their corresponding cloned fragment. The
sequences presented in Table 1 were selected to show all found
combinations of the neighboring 1 bp, flanking both sides of
the TspGWI recognition sequences. The computer prediction
revealed 107 5'-ACGGA-3' sites in lambda. Four out of the 16
possible combinations of 1-bp neighborhoods are
under-represented in the lambda genome and they were also not detected
during the junction sequence analysis (marked in bold):
5'-AACGGAT-3' (5/107), 5'-AACGGAC-3' (6/107),
5'-TACGGAC-3' (1/107) and 5'-TACGGAG-3' (0/107). Since the
5'-AACGGAT-3' site present in the tetracycline resistance gene
is slowly cleaved, this may apply to lambda as well. This would
decrease the cloning efficiency of 5'-AACGGAT-3' fragments.
Whether the absence of 5'-AACGGAC-3' and
5'-TACGGAC-3' sequences amongst the analyzed junctions is
also caused by diminished reaction efficiency remains to be
determined.
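The tally of 1-bp neighborhoods described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: `flank_contexts` and the toy sequence are hypothetical, the scan treats the sequence as linear, and sites on the bottom strand are found by scanning the reverse complement so that each flank is read 5' to 3' relative to the site.

```python
from collections import Counter
from itertools import product


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def flank_contexts(seq: str, site: str) -> Counter:
    """Count each (5' base + 3' base) pair flanking `site` on both strands."""
    counts = Counter({a + b: 0 for a, b in product("ACGT", repeat=2)})
    seq = seq.upper()
    for strand in (seq, reverse_complement(seq)):
        start = strand.find(site, 1)  # start at 1 so a 5' neighbor exists
        while start != -1 and start + len(site) < len(strand):
            counts[strand[start - 1] + strand[start + len(site)]] += 1
            start = strand.find(site, start + 1)
    return counts


# Toy sequence (hypothetical): one top-strand and one bottom-strand site.
toy = "GAACGGATTTTCCGTAG"
contexts = flank_contexts(toy, "ACGGA")
print({k: v for k, v in contexts.items() if v})
```

Run on the lambda sequence, the 16 counters would correspond to the neighborhood frequencies quoted in the text, such as 5/107 for 5'-AACGGAT-3'.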

The TspGWI recognition sites from the sequenced junctions
appear in four possible configurations: as top or bottom strand in
either an insert or an uncloned segment of adjacent pBR322/lambda
DNA regions (Table 1). The comparison of various clone
configurations allowed the deduction of the cleavage positions,
which are located further downstream from the 5'-ACGGA-3'
sequence, at the 11th and the 9th nucleotide in the top and bottom
strands (vertical arrows), respectively, leaving 2 nt
3'-overhangs:



5'-ACGGA NNNNNNNNNNN↓NNN-3'
3'-TGCCT NNNNNNNNN↑NNNNN-5'
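The geometry of this staggered cut can be made explicit with a short sketch. It is an illustration only: `cleave_downstream` and the toy sequence are hypothetical, all coordinates are expressed on the top strand, and only the first occurrence of the site on a linear top strand is considered.

```python
def cleave_downstream(seq: str, site: str, top_offset: int = 11,
                      bottom_offset: int = 9) -> tuple[int, int, str]:
    """Locate the staggered cut made downstream of the first `site`.

    Offsets count nucleotides after the last base of the site. Returns
    (top_cut, bottom_cut, overhang): 0-based top-strand indices at which
    each strand is broken, and the top-strand bases left single-stranded
    as a 3' protrusion on the site-containing fragment.
    """
    start = seq.find(site)
    if start == -1:
        raise ValueError("site not found")
    end = start + len(site)
    top_cut = end + top_offset
    bottom_cut = end + bottom_offset
    if top_cut > len(seq):
        raise ValueError("sequence too short downstream of the site")
    return top_cut, bottom_cut, seq[bottom_cut:top_cut]


# Toy top strand (hypothetical): site followed by 9 A residues, then CGTT.
toy = "TT" + "ACGGA" + "AAAAAAAAACGTT"
print(cleave_downstream(toy, "ACGGA"))
```

Because the top strand is cut 2 nt further downstream than the bottom strand, the overhang is always 2 nt long and ends in a 3' hydroxyl on the top strand, in agreement with the 2 nt 3'-overhangs described above.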
The recognition sequence and cleavage points of TspGWI
show similarities to those of the two known Thermus-derived
restriction endonucleases of type-IIS: TaqII from Thermus
aquaticus, 5'-GACCGA(N11/9)-3' or 5'-CACCCA(N11/9)-3' (5), and
Tth111II from Thermus thermophilus 111, 5'-CAARCA(N11/9)-3'
(6). The conservation of cleavage positions can be observed, as
well as of the first and the last A of the TspGWI recognition
site. This may indicate a common evolutionary origin for all
these endonucleases.

The optimum reaction conditions for TspGWI are 50 mM
Tris-HCl pH 8.5 at 25°C, 10 mM MgCl2, 10 mM DTT and
3 mM spermidine. The enzyme is active between 42 and 85°C,
with a maximum at 65 and 75°C. The enzyme can be inactivated
by 20 min incubation at 89°C.